KUNLP System for NTCIR-3 English-Korean Cross-Language Information Retrieval

Authors

  • Hee-Cheol Seo
  • Sang-Bum Kim
  • Baeg-Il Kim
  • Hae-Chang Rim
  • Sang-Zoo Lee

Abstract

This paper describes the KUNLP system for the English-Korean cross-language information retrieval track of the NTCIR-3 workshop, along with experiments conducted after the workshop. The system uses a query translation method based on a bilingual dictionary and a document-language corpus. To automatically transliterate proper nouns such as Korean person names, place names, and company names, we constructed a bilingual biographical dictionary and collected the corresponding translations of Korean place names and company names. We submitted one monolingual run and three cross-language runs, all of which used only the description field of each topic as the query. The cross-language runs differed in whether query expansion was used and whether manual transliteration was applied. Comparisons between the cross-language runs show that query expansion is useful in English-Korean cross-language information retrieval and that transliteration also improves system performance. Additional experiments after the NTCIR-3 workshop show that a Korean query consisting of only the best translation equivalent for each English query term is more effective than one consisting of two or more translation equivalents per term. In addition, including English acronyms and initialisms in the Korean query helps retrieve Korean documents.
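The dictionary-based query translation summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function `translate_query`, the toy dictionary, and the use of raw corpus frequency to rank translation equivalents are all assumptions made for illustration, since the abstract does not specify how the document-language corpus is used. Terms with no dictionary entry (such as English acronyms) are kept unchanged in the Korean query.

```python
from collections import Counter


def translate_query(english_terms, bilingual_dict, corpus_freq, top_k=1):
    """Build a Korean query from English terms (illustrative sketch only).

    For each term, candidate translations from the bilingual dictionary are
    ranked by their frequency in the document-language corpus (an assumed
    scoring rule), and the top_k candidates are kept. Terms missing from the
    dictionary, e.g. acronyms, are passed through unchanged.
    """
    korean_query = []
    for term in english_terms:
        candidates = bilingual_dict.get(term.lower())
        if not candidates:
            korean_query.append(term)  # keep untranslatable terms as-is
            continue
        ranked = sorted(candidates, key=lambda c: corpus_freq.get(c, 0), reverse=True)
        korean_query.extend(ranked[:top_k])
    return korean_query


# Hypothetical toy data, not taken from the paper.
bilingual_dict = {"bank": ["은행", "둑"], "loan": ["대출", "융자"]}
corpus_freq = Counter({"은행": 120, "둑": 3, "대출": 80, "융자": 15})

best_only = translate_query(["bank", "loan", "IMF"], bilingual_dict, corpus_freq)
two_each = translate_query(["bank", "loan", "IMF"], bilingual_dict, corpus_freq, top_k=2)
print(best_only)
print(two_each)
```

Setting `top_k=1` corresponds to the "best translation equivalent" configuration, while `top_k=2` corresponds to a query with several equivalents per term; the sketch only shows how the two queries differ, not which one retrieves better.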

Similar resources

KUNLP System for NTCIR-4 Korean-English Cross-Language Information Retrieval

This paper describes our Korean-English cross-language information retrieval system for NTCIR-4. Our system is based on a query translation approach with a bilingual dictionary and co-occurrence information between English terms in an English corpus. This year, we have focused on the translation of unknown words. We have expanded the existing bilingual dictionary by gathering some of the Korean-Engl...

NTCIR-4 Chinese, English, Korean Cross Language Retrieval Experiments Using PIRCS

In NTCIR-4 we participated in Korean, Chinese, and English monolingual, Chinese-English and English-Korean bilingual, and Chinese-Korean cross-language (using English as pivot) retrieval tasks based on our PIRCS retrieval system. The query translation approach was employed for CLIR. We combined two MT translations for Chinese-English, and two for English-Korean. For the latter, a web-based entity-orient...

Cross-Language IR at University of Tsukuba: Automatic Transliteration for Japanese, English, and Korean

This paper describes our cross-language information retrieval system for the NTCIR-4 CLIR task. Our system, which follows the query translation approach, uses compound word translation and transliteration. Transliteration is effective if a query includes foreign words, such as technical terms and proper nouns, spelled out in a phonetic alphabet. We apply our method, which was originally propos...

AINLP at NTCIR-6: Evaluations for Multilingual and Cross-Lingual Information Retrieval

In this paper, a multilingual cross-lingual information retrieval (CLIR) system is presented and evaluated in the NTCIR-6 project. We use a language-independent indexing technology to process text collections in Chinese, Japanese, Korean, and English. Different machine translation systems are used to translate the queries for bilingual and multilingual CLIR. The experimental results...

NTCIR-5 Chinese, English, Korean Cross Language Retrieval Experiments using PIRCS

In NTCIR-5 our focus is to see whether web-assisted query expansion is useful, and to test an English-Korean bilingual dictionary. We participated in Chinese, Japanese, Korean and English monolingual retrieval, also using web expansion for Chinese and English. We also performed Chinese-English, English-Chinese, English-Korean bilingual, and Chinese-Korean pivot bilingual CLIR. The query translation ap...

Publication date: 2002